DECLARATION UNDER 37 C.F.R. § 1.131

Sir: 

We, Peter J. Olandt, Rachel E. Meyers, and Katherine M. Galvin, hereby declare and state:

1. In the United States, the conception of the sequence of the human 33945 molecules of the invention 
and the identification of the 33945 polypeptide as a glycosyltransferase occurred prior to December 
15, 2000 and the reduction to practice comprising obtaining the final sequence known as SEQ ID 
NO:1 in the above-identified application was performed with due diligence until December 18,
2000, the date of the actual reduction to practice. 

2. Evidence of conception prior to December 15, 2000 is provided in Exhibits A1-A3, which are 
copies of electronic printouts of a map of clones contributing to the 33945 nucleotide sequence and 
analyses of early 33945 sequences. 

Exhibit A1 is a copy of page 1 of a Sequencher™ map identifying the clones contributing
to the 33945 nucleotide sequence, the clone sizes and the positions of the clones relative to the 
33945 sequence known at that stage of the invention process. Exhibit A2 is a copy of a BLAST 
analysis of a translation of that nucleotide sequence. The map was compiled and the analysis was 
performed prior to December 15, 2000. By that time, the sequence was extensive, spanning 2109
nucleotides, and the BLAST revealed similarity of the 33945 polypeptide to glycosyltransferases. 
Exhibit A3 is a copy of a series of analyses performed on the polypeptide encoded by that 33945 
nucleotide sequence. Page 1 of this printout bears the nearly complete polypeptide sequence known 
at the time, showing that it has the full length of 581 amino acids, but a few uncertain residues; page 
3 bears the results of a Pfam analysis which matched a portion of the 33945 sequence with the Pfam 
Glycosyl transferase domain consensus sequence; pages 4 and 5 bear the results of an analysis 
which matched portions of the 33945 polypeptide sequence with glycosyltransferase domain 
consensus sequences from the ProDom database. The combined result of the analyses was the 
determination that the 33945 molecules of the invention represent a glycosyltransferase. 

The original printouts in Exhibits A1-A3 bear the automatically embedded dates on which 
the analyses were performed. In accordance with accepted practice, the dates on the copies of the 
electronic printouts have been masked (M.P.E.P. § 715.07). 

3. Evidence of the exercise of due diligence in the process of reducing to practice the 33945 molecules 
of the invention is provided in Exhibits B1-B5. In accordance with M.P.E.P. § 715.07, the actual 
dates of the acts portrayed in Exhibits B1-B5 have been provided to establish diligence. In 
accordance with M.P.E.P. § 715.07(a), the acts performed just prior to the effective date of
December 15, 2000 until the December 18, 2000 date of the actual reduction to practice are 
included in Exhibits B1-B5. 

Exhibit B1 is a copy of page 1 of an updated Sequencher™ map compiling the clones
contributing to the 33945 nucleotide sequence as understood by November 27, 2000. This Exhibit
shows additional 5' clones, "fbhX33945phg01b1.abi" and "fbhX33945phh01b1.abi,"
which were not present on Exhibit A1. In addition, Exhibit B1 has a note written by inventor Peter
Olandt, describing a 2 base pair problem needing to be solved. In order to solve this problem,
additional clones were prepared to cover the region in question. This clone preparation process
yielded four additional 5' clones, "fbhX33945peb04h1," "fbhX33945pee03g1,"
"fbhX33945pfd04h1" and "fbhX33945pfg03g1."

Clone fbhX33945pee03g1 is used herein as an example of the timecourse and types of
analyses performed on these clones to show due diligence. Exhibit B2 provides a summary of the
facts related to clone fbhX33945pee03g1, together with its nucleotide sequence. At the top of
Exhibit B2, one can see that this clone was submitted for sequencing on December 12, 2000. As
seen in the middle of the Exhibit, fbhX33945pee03g1 came out of sequencing on December 14,
2000 and was submitted for analyses. The first analysis was performed on December 14, 2000, and 
subsequent analyses were performed on December 15 and 16, 2000. 

Exhibit B3 shows that on Monday, December 18, 2000, the four new clones were 
assembled into a new Sequencher™ clone map. The problem of base pair selection was solved and 
the complete 33945 nucleic acid sequence ("Fbh33945FL"), known in the application as SEQ ID 
NO:1, was finalized and submitted to the Millennium database on Monday, December 18, 2000, as
shown on Exhibit B4. Exhibit B5, also performed on December 18, 2000, shows that analyses 
performed on the polypeptide encoded by the complete nucleotide sequence supported the earlier 
conclusion of 33945 as a glycosyltransferase drawn from the evidence of prior conception provided 
in Exhibits A2 and A3. 
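
The dates recited in this paragraph can be laid out as a simple timeline. The sketch below is illustrative only and is not part of the declaration or its exhibits: the event labels are paraphrases, and the names `EVENTS`, `timeline`, and `longest_gap` are hypothetical. It uses Python's standard `datetime` module to list each dated act with its weekday and the number of days since the preceding act.

```python
from datetime import date

# Dated acts as recited in paragraph 3 (labels are paraphrases, not exhibit text).
EVENTS = [
    (date(2000, 12, 12), "example clone submitted for sequencing (Exhibit B2)"),
    (date(2000, 12, 14), "clone out of sequencing; first analysis (Exhibit B2)"),
    (date(2000, 12, 15), "subsequent analysis (Exhibit B2)"),
    (date(2000, 12, 16), "subsequent analysis (Exhibit B2)"),
    (date(2000, 12, 18), "new clone map; final sequence submitted (Exhibits B3-B5)"),
]


def timeline(events):
    """Return (date, weekday name, days since previous act, label) rows."""
    rows = []
    previous = None
    for day, label in sorted(events):
        gap = 0 if previous is None else (day - previous).days
        rows.append((day, day.strftime("%A"), gap, label))
        previous = day
    return rows


def longest_gap(events):
    """Largest number of days between consecutive dated acts."""
    return max(row[2] for row in timeline(events))


if __name__ == "__main__":
    for day, weekday, gap, label in timeline(EVENTS):
        print(f"{day.isoformat()} {weekday:<9} +{gap}d  {label}")
```

On these dates, no two consecutive acts are more than two days apart, and December 18, 2000 falls on a Monday, as the declaration states.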



We hereby declare that all statements made herein of our own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these statements were 
made with the knowledge that willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the United States Code, and that such willful false
statements may jeopardize the validity of the application or any patent issued thereon. 



Peter J. Olandt Date



Rachel E. Meyers Date 



Katherine M. Galvin Date
Sequencher™ "33945"

[Contig map graphic not reproduced. Fragment labels and positions are listed as printed; "(?)" marks a label or position that is only partly legible in the copy.]

fbhX33945pgbb02b1.abi, 1 to 371
fbhX33945pgba01a1.abi, 1 to 373
fbhX33945pgba02b1.abi, 1 to 380
fbhX33945pgbb01a1.abi, 13 to 373
AL136084.nt|GENSCAN_predicted_C, 221 to 569
AI863865 in DBEst, 226 to 727
AA493187 in DBEst, 261 to 593
AA807(?) [remainder of label illegible]
(?)429394 in DBEst, 265 to 727
(?)300923 in DBEst, 269 to 822
(?) in DBEst, 3(?)4 to 574
AA836046 in DBEst, 518 to 737
AL136084.nt|GENSCAN_predicted_C, 570 to 760
(?) in DBEst, 620 to 1096
AA429393 in DBEst, 623 to 1096
AL136084.nt|GENSCAN_predicted_C, 758 to 1241
BE167242 in DBEst, 962 to 1138
AW843782 in DBEst - Import - c, 1104 to 1329
AW814059 in DBEst - Import - c, 1175 to 1395
john h2Q4g03t1.abi (?), 1197 to 1591
ithA a158a12t1.abi (?), 1203 to 1600
AL136084.nt|GENSCAN_predicted_C, 1241 to 1774
iThzc11S7a 07t1.abi (?), 1(?)57 to 1425
cfahneOOl hOTjotl .abi (?), 1308 to 1552
i ohne001h07t1.abi (?), 1308 to 1803
AC007800.nt|GENSCAN_predicted_C, 1371 to 1774
cMh gad053c04a1.abi (?), 1495 to 1916
cMhv f090g07a1.abi (?), 1600 to 1957
AI638649 in DBEst - Import - c, 1608 to 2063
AA554045 in DBEst - Import - c, 1633 to 2067
AI916034 in DBEst - Import - c, 1664 to 2067
jlhbaa0 33c02t1 abi (?), 1667 to 2067
johnd066h12t1.abi (?), 1701 to 2031
AI636959 in DBEst - Import - c, 1701 to 2076
AA994913 in DBEst - Import - c, 1706 to 2063
ootivBA001e10a1. abi (?), 1801 to 1957

Position scale (tick marks as printed): 220, 376, 620, 730, 962, 1,098, 1,334, 1,600, 1,776, 2,076

Diagram Key: Hole in contig; Single fragment; Multiple fragments same direction; Both strands; Both strands plus. Bumps on fragments show motifs, hollow [remainder cut off]
Exhibit A2 to Accompany Declaration under
33945 (analysis only) (2109 bases)

WU2 BLAST vs. PROT - Selected Database Hits 

>gi|2121220|gb|AAB58301| (U73819) polypeptide GalNAc transferase-T4 [Mus
musculus]

Length = 578

Plus Strand HSPs: 

Score = 1654 (587.3 bits), Expect = 3.7e-169, P = 3.7e-169 
Identities = 328/570 (57%), Positives = 405/570 (71%), Frame = +2 

Query: 128 VLLALLALAGL GSVLRAQRGAGAGAAEPGPPRTPRPGRRE PVMPRPPVPA 277 

+LLALL LA + SLA GAG GA E GP R P RE P+ +PP + 

Sbjct: 13 LLLALLTLAYILVEFSVSTLYASPGAG-GARELGPRRLPDLDTREEDLSQPLYIKPPADS 71 

Query: 278 NALGARGEAVRLQLQGEELRLQEESVRLHQINIYLSDRISLHRRLPXRWNPLCKEKKYDY 457 

+ALG G A +LQL EL+ QEE + + INIYLSDRISLHR + + CK KK+ Y 
Sbjct: 72 HALGEWGRASKLQLNEGELKQQEELIERYAINIYLSDRISLHRHIEDKRMYECKAKKFHY 131 

Query: 458 DNLPRTSVIIAFYNEAWSTLLRTVYSVLETSPDILLEEVILVDDYSDREHLKERLANELS 637 

+LP TSVIIAFYNEAWSTLLRT++SVLETSP +LL+E+ILVDD SDR +LK +L +S 
Sbjct: 132 RSLPTTSVIIAFYNEAWSTLLRTIHSVLETSPAVLLKEIILVDDLSDRIYLKAQLETYIS 191 

Query: 638 GLPKVRLIRANKREGLVRARLLGASAARGDVLTFLDCHCECHEGWLEPLLQRIHEEESAV 817

L +VRLIR NKREGLVRARL+GA+ A GDVLTFLDCHCEC+ GWLEPLL+RI +E+A+ 
Sbjct: 192 NLERVRLIRTNKREGLVRARLIGATFATGDVLTFLDCHCECNTGWLEPLLERISRDETAI 251 

Query: 818 VCPVIDVIDWNTFEYLGNSGEPQIGGFDWRLVFTWHTVPERERIRMQSPVDVIRSPTMAG 997

VCPVID IDWNTFE+ +GEP IGGFDWRL F WH+VP+ ER R S +D IRSPTMAG 
Sbjct: 252 VCPVIDTIDWNTFEFYMQTGEPMIGGFDWRLTFQWHSVPKHERDRRTSRIDPIRSPTMAG 311 

Query: 998 GLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGGVLETHPCSHVGHVFRKQAPYS 1177 

GLFAVSKKYF+YLG+YDTGMEVWGGENLE SFR+WQCGG LE HPCSHVGHVF K+APY+ 
Sbjct: 312 GLFAVSKKYFQYLGTYDTGMEVWGGENLELSFRVWQCGGKLEIHPCSHVGHVFPKRAPYA 371

Query: 1178 RNKALANSVXAAEVWMDEFKELYYHRNPRARLEPFGDVTERKQLRDKLQCKDFKWFLETV 1357

R L N+ AAEVWMDE+KE +Y+RNP AR E +GD++ERK LR++L+CK F W+L+ V 
Sbjct: 372 RPNFLQNTARAAEVWMDEYKEHFYNRNPPARKEAYGDLSERKLLRERLKCKSFDWYLKJ^ 431 

Query: 1358 YPELHVPEDRPGFFGMLQNKGLTDYCFDYNPPDENQIVGHQVILYLCHGMGQNQFFEYTS 1537 

+ LHVPEDRPG+ G +++ G++ C DYN PD N G + L+ CHG G NQFFEYTS 
Sbjct: 432 FSNLHVPEDRPGWHGAIRSMGISSECLDYNAPDNNP-TGANLSLFGCHGQGGNQFFEYTS 490 



Query: 1538 QKEIRYNTHQPEGCIAVEAGMDTLIMHLCEETA---PENQKFILQEDGSLFHEQSKKCVQ 1708

KEIR+N+ ECV D + MCt PN + +EDG++FH ++ C+ 

Sbjct: 491 NKEIRFNS--WEL(:AEVPQQKDYVGMQNCPKDGLPVPVNIIWHFKEDGTIFHPHTRLCLS 549 



Query: 1709 AARKESSDSFVPLLRDCTNSD-HQKWFFKE 1795

AR V++C D+QW F++ 

Sbjct: 550 AYRTAEGRPSVHM-KTCDALDKNQLWRFEK 578 

>gi|1934912|emb|CAA69875| (Y08564) UDP-GalNAc:polypeptide

N-acetylgalactosaminyltransferase [Homo sapiens]
Length = 578

Plus Strand HSPs: 

Score = 1617 (574.3 bits), Expect = 3.0e-165, P = 3.0e-165
Identities = 322/570 (56%), Positives = 399/570 (70%), Frame = +2

Query: 128 VLLALLALAG LGSVLRAQRGAGAGAAEPGPPRTPRPGRR EPVMPRPPVPA 277 

+ LLA L +A L S A GAG A E G R + p+ +pp + 

Sbjct: 13 LLLAFLTVAYIFVELLVSTFHASAGAGR-ARELGSRRLSDLQKNTEDLSRPLYKKPPADS 71 

Query: 278 NALGARGEAVRLQLQGEELRLQEESVRLHQINIYLSDRISLHRRLPXRWNPLCKEKKYDY 457

ALG G+A +LQL +EL+ QEE + + INIYLSDRISLHR + + CK +K++Y 
Sbjct: 72 RAIXSEWGKASKI^U^EDELKQQEELIERYAINIYLSDRISLHRHIEDKRMYECKSQKFNY 131 

Query: 458 DNLPRTSVIIAFYNEAWSTLLRTVYSVLETSPDILLEEVILVDDYSDREHLKERLANELS 637

LP TSVIIAFYNEAWSTLLRT++SVLETSP +LL+E+ILVDD SDR +LK +L +S 
Sbjct: 132 RTLPTTSVIIAFYNEAWSTLLRTIHSVLETSPAVLLKEIILVDDLSDRVYLKTQLETYIS 191 

Query: 638 GLPKVRLIRANKREGLVRARLLGASAARGDVLTFLDCHCECHEGWLEPLLQRIHEEESAV 817 

L +VRLIR NKREGLVRARL+GA+ A GDVLTFL CHCEC+ GWLEPLL+RI E+AV 
Sbjct: 192 NLDRVRLIRTNKREGLVRARLIGATFATGDVLTFLYCHCECNSGWLEPLLERIGRYETAV 251 

Query: 818 VCPVIDVIDWNTFEYLGNSGEPQIGGFDWRLVFTWHTVPERERIRMQSPVDVIRSPTMAG 997

VCPVID IDWNTFE+ GEP IGGFDWRL F WH+VP++ER R S +D IRSPTMAG
Sbjct: 252 VCPVIDTIDWNTFEFYMQIGEPMIGGFDWRLTFQWHSVPKQERDRRISRIDPIRSPTMAG 311

Query: 998 GLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGGVLETHPCSHVGHVFRKQAPYS 1177 

GLFAVSKKYF+YLG+YDTGMEVWGGENLE SFR+WQCGG LE HPCSHVGHVF K+APY+
Sbjct: 312 GLFAVSKKYFQYLGTYDTGMEVWGGENLELSFRVWQCGGKLEIHPCSHVGHVFPK^ 371

Query: 1178 RNKALANSVXAAEVWMDEFKELYYHRNPRARLEPFGDVTERKQLRDKLQCKDFKWFLETV 1357

R L N+ AAEVWMDE+KE +Y+RNP AR E +GD++ERK LR++L+CK F W+L+ V 
Sbjct: 372 RPNFLQNTARAAEVWMDEYKEHFYNRNPPARKEAYGDISERKLLRERLRCKSro 

Query: 1358 YPELHVPEDRPGFFGMLQNKGLTDYCFDYNPPDENQIVGHQVILYLCHGMGQNQFFEYTS 1537 

+P LHVPEDRPG+ G ++++G+r C DYN PD N G + L+ CHG G NQFFEYTS 
Sbjct: 432 FPNLHVPEDRPGWHGAIRSRGISSECLDYNSPDNNP-TGANLSLFGCHGQGGNQFFEYTS 490 



Query: 1538 QKEIRYNTHQPEGCIAVEAGMDTLIMHLCEETA---PENQKFILQEDGSLFHEQSKKCVQ 1708



KEIR+N+ ECV ++MC+ PN + +EDG++FH S C+ 

Sbjct: 491 NKEIRFNS-VTELCAEVPEQKNYVGMQNCPKDGFPVPANIIWHFKEDGTIFHPHSGLCLS 549 



Query: 1709 AARKESSDSFVPLLRDCTNSD-HQKWFFKE 1795 

AR V + RC D+QW F++ 

Sbjct: 550 AYRTPEGRPDVQM-RTCDALDKNQIWSFEK 578 

>gi|10437274|dbj|BAB15027| (AK024865) unnamed protein product [Homo sapiens]
Length = 284

Plus Strand HSPs: 

Score = 1547 (549.6 bits), Expect = 8.0e-158, P = 8.0e-158
Identities = 282/284 (99%), Positives = 282/284 (99%), Frame = +2 

Query: 953 MQSPVDVIRSPTMAGGLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGGVLETHP 1132

MQSPVDVIRSPTMAGGLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGGVLETHP
Sbjct: 1 MQSPVDVIRSPTMAGGLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGGVLETHP 60

Query: 1133 CSHVGHVFRKQAPYSRNKALANSVXAAEVWMDEFKELYYHRNPRARLEPFGDVTERKQLR 1312

CSHVGHVF KQAPYSRNKALANSV AAEVWMDEFKELYYHRNPRARLEPFGDVTERKQLR
Sbjct: 61 CSHVGHVFPKQAPYSRNKALANSVRAAEVWMDEFKELYYHRNPRARLEPFGDVTERKQLR 120

Query: 1313 DKLQCKDFKWFLETVYPELHVPEDRPGFFGMLQNKGLTDYCFDYNPPDENQIVGHQVILY 1492 

DKLQCKDFKWFLETVYPELHVPEDRPGFFGMLQNKGLTDYCFDYNPPDENQIVGHQVILY
Sbjct: 121 DKLQCKDFKWFLETVYPELHVPEDRPGFFGMLQNKGLTDYCFDYNPPDENQIVGHQVILY 180 

Query: 1493 LCHGMGQNQFFEYTSQKEIRYNTHQPEGCIAVEAGMDTLIMHLCEETAPENQKFILQEDG 1672

LCHGMGQNQFFEYTSQKEIRYNTHQPEGCIAVEAGMDTLIMHLCEETAPENQKFILQEDG
Sbjct: 181 LCHGMGQNQFFEYTSQKEIRYNTHQPEGCIAVEAGMDTLIMHLCEETAPENQKFILQEDG 240 

Query: 1673 SLFHEQSKKCVQAARKESSDSFVPLLRDCTNSDHQKWFFKERML 1804 

SLFHEQSKKCVQAARKESSDSFVPLLRDCTNSDHQKWFFKERML 
Sbjct: 241 SLFHEQSKKCVQAARKESSDSFVPLLRDCTNSDHQKWFFKERML 284

>gi|5834600|emb|CAA69876| (Y08565) UDP-GalNAc:polypeptide

N-acetylgalactosaminyltransferase [Homo sapiens]
Length = 622

Plus Strand HSPs: 

Score = 40 (19.1 bits), Expect = 9.9e-120, Sum P(2) = 9.9e-120
Identities = 8/14 (57%), Positives = 9/14 (64%), Frame = +3 

Query: 204 PSRDPRAPRAPGGA 245 

P +DP AP A G A 
Sbjct: 106 PPQDPNAPGADGKA 119 



Score = 1168 (416.2 bits), Expect = 9.9e-120, Sum P(2) = 9.9e-120
Identities = 246/537 (45%), Positives = 338/537 (62%), Frame = +2 

Query: 233 PGRREPVMPRPPVPANALGARGEAV-RLQLQGEELRLQEESVRLHQINIYLSDRISLHRR 409 

P +P RPP NA GA G+A + + E + +EE + H N + SDRISL R 
Sbjct: 96 PAELKPPWERPPQDPNAPGADGKAFQKSKWTPLETQEKEEGYKKHCFNAPASDRISLQRS 155 

Query: 410 L-PXRWNPLCKEKKYDY-DNLPRTSVIIAFYNEAWSTLLRTVYSVLETSPDILLEEVILV 583 

LP PC ++K+ L TSVII P+NEAWSTLLRTVYSVL T+P ILL+E+ILV 

Sbjct: 156 I^POTRPPECVDQKFRRCPPIATTSVIIVFHNEAWSTLLRTVYSVIJOT'PAILLKEIILV 215 

Query: 584 DDYSDREHLKERLANELSGLPKVRLIRANKREGLVRARLLGASAARGDVLTFLDCHCECH 763

DD S EHLKE+L + L VR++R +R+GL+ ARLLGAS A+ +VLTFLD HCEC 
Sbjct: 216 DDASTEEHLKEKLEQYVKQLQVVRVVRQEERKGLITARLLGASVAQAEVLTFLDAHCECF 275

Query: 764 EGWLEPLLQRIHEEESAVVCPVIDVIDWNTFEYLGNSGEPQI---GGFDWRLVFTWHTVP 934

GWLEPLL RI E+++ W P I ID NTFE+ ++ G FDW L F W T+P 

Sbjct: 276 HGWLEPLLARIAEDKTVWSPDIVTIDLNTFEFAKPVQRGRVHSRGNFDWSLTFGWETLP 335 

Query: 935 ERERIRMQSPVDVIRSPTMAGGLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGG 1114 

E+ R + I+SPT AGGLF++ K YFE++G+YD ME+WGGEN+E SFR+WQCGG 

Sbjct: 336 PHEKQRRKDETYPIKSPTFAGGLFSIPKSYFEHIGTYDNQMEIWGGENVEMSFRVWQCGG 395

Query: 1115 VLETHPCSHVGHVFRKQAPYSRNKALA NSVXAAEVWMDEFKELYYHRNPRA R 1270 

LE PCS VGHVFR ++P++ K + N V AEVWMD +K+++Y RN +A + 
Sbjct: 396 QLEIIPCSVVGHVFRTKSPHTFPKGTSVIARNQVRLAEVWMDSYKKIFYRRNLQAAKMAQ 455

Query: 1271 LEPFGDVTERKQLRDKLQCKDFKWFLETVYPELHVPEDRPGFFGMLQNKGLTDYCFDYNP 1450 

+ FGD++ER QLR++L C +F W+L VYPE+ VP+ P F+G ++N G T+ CD 
Sbjct: 456 EKSFGDISERLQLREQLHCHNFSWYLHNVYPEMFVPDLTPTFYGAIKNLG-TNQCLDVG- 513 

Query: 1451 PDENQIVGHQVILYLCHGMGQNQFFEYTSQKEIRYNTHQPEGCIAVEAGMDTL-IMHLCE 1627

EN G +I+Y CHG+G NQ+FEyT+Q+++R+N + + C+ V G L H 
Sbjct: 514 --Ea»JRGGKPLIMYSCHGLGGNQYFEYTTQRDLRHNIAK-QLCLHVSKGAI>6LGSCHFTG 570 

Query: 1628 ETA~PENQKF1LQEDGSLFHEQSKKCVQAARKESSDSFVPLLRDCTNSD-HQKWFF 1789 

+ + P+++++ L +D + + S C+ + K+ P + C SO HQ W F 

Sbjct: 571 KNSQVPKDEEWELAQDQLIRNSGSGTCLTSQDKK PAMAPCNPSDPftQLWLF 621 

>gi|3047191|gb|AAC13671| (AF031835) GLY5a; ppGaNTase [Caenorhabditis elegans]
>pir|T42245|T42245 probable polypeptide

N-acetylgalactosaminyltransferase (EC 2.4.1.41) - Caenorhabditis
elegans

Length = 623



Plus Strand HSPs: 



Score = 1185 (422.2 bits), Expect = 1.8e-119, P = 1.8e-119
Identities = 252/530 (47%), Positives = 326/530 (61%), Frame = +2 



Query: 251 VMPRPPV PANALGARGEAV— -RLQLQGEELRLQEESVRLHQINIYLSDRISLHRR 409 

V P P+ A G G+AV + +L EE ++ + + N Y SD IS+HR 

Sbjct: 97 VDPNDPIYKKGDAAQAGELGKAWVDKTKLSTEEKAKYDKGMLNNAFNQYASDMISVHRT 156 

Query: 410 LPXRWNPLCKEKKYDYDNLPRTSVIIAFYNEAWSTLLRTVYSVLETSPDILLEEVILVDD 589

LP + CK +KY+ +NLPRTSVII F+NEAWS LLRTV+SVLE +PD LLEEV+LVDD 
Sbjct: 157 LPTNIDAECKTEKYN-ENLPRTSVIICFHNEAWSVLLRTVHSVLERTPDHLLEEVVLVDD 215

Query: 590 YSDREHLKERLANELSGLP-KVRLIRANKREGLVRARLLGASAARGDVLTFLDCHCECHE 766

+SD +H K L +S KV+++R KREGL+RARL GA+ A G+VLT+LD HCEC E 
Sbjct: 216 FSDMDHTKRPLEEYMSQFGGKVKILRMEKREGLIRARLRGAAVATGEVLTYLDSHCECME 275 

Query: 767 GWLEPLLQRIHEEESAVVCPVIDVIDWNTFEYLGNSGE-PQIGGFDWRLVFTWHTVPERE 943

GW+EPLL RI + + WCPVIDVID NTFEY + +GGFDW L F WH++PER+ 

Sbjct: 276 GWMEPLUJRIKRDPTTWCPVIDVIDDNTFEYHHSKAYFTSVGGFDWGLQFNWHSIPERD 335 

Query: 944 RIRMQSPVDVIRSPTMAGGLFAVSKKYFEYLGSYDTGMEVWGGENLEFSFRIWQCGGVLE 1123 

R P+D +RSPTMAGGLF++ K+YFE LG+YD G ++WGGENLE SF+IW CGG LE 

Sbjct: 336 RKNRTRPIDPVRSPTMAGGLFSIDKEYFEKLGTYDPGFDIWGGENLELSFKIWMCGGTLE 395

Query: 1124 THPCSHVGHVFRKQAPYS-R NKALANSVXAAEVWMDEFKELYYHRNPRARLEPFGDV 1291 

PCSHVGHVFRK++PY R N NS+ AE\Atf+D++K YY R +L FGD+ 
Sbjct: 396 IVPCSHVGHVFRKRSPYKWRTGVNVLKRNSIRIAEVmiDDYKTYYYERINN-QI^ 454 

Query: 1292 TERKQLRDKLQCKDFKWFLETVYPELHVPEDRPGFFGMLQNKGLTDYCFDYNPPDENQIV 1471 

+ RK+LR+ L CK FKW+L+ +YPEL VP + M G C DY P 
Sbjct: 455 SSRKKIJlEDl^CKSFKVryLDNIYPELFVPGESVAKGEMRNAGGKNRQCIDYKPSG 509 

Query: 1472 GHQVILYLCHGMGQNQFFEYTSQKEIRYNTHQPEGCIAVEAGMDTLIMHLCEETAPENQK 1651

G V +Y CH G NQ++ + EIR + E C+ AG D ++ C NQ+ 
Sbjct: 510 GKTVGMYQCHNQGGNQYWMLSKDGEIR RDESCVDY-AGSDVMVFP-CHGMKG-NQE 562 

Query: 1652 FILQED-GSLFHEQSKKCVQAARKESSDSFVPLLRDCTNSD-HQKWFFKE 1795 

+ D G L H S+KC+ + + V C D +Q W FKE 

Sbjct: 563 WRYNHDTGRLQHAVSQKCLGMTKDGAKLEMVA CQYDDPYQHWKFKE 608 
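
The summary lines in the BLAST report above can be cross-checked with a little arithmetic. The following sketch is not part of the exhibit; the names `percent`, `plus_frame`, `residues_spanned`, and `HSPS` are hypothetical. It assumes that the report's percentages are the identity or positive count divided by the alignment length and truncated to a whole number, and that a plus-strand frame follows from the 1-based nucleotide start of the query. Both assumptions are consistent with the figures printed above but are not stated in the report itself.

```python
def percent(count, length):
    """Whole-number percentage, truncated (assumed reporting convention)."""
    return (100 * count) // length


def plus_frame(query_start):
    """Plus-strand reading frame (1, 2, or 3) for a 1-based nucleotide start."""
    return (query_start - 1) % 3 + 1


def residues_spanned(query_start, query_end):
    """Number of codons covered by an inclusive nucleotide range."""
    span = query_end - query_start + 1
    if span % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return span // 3


# Keyed by the nucleotide accession printed in each hit header.
# Values: (identities, positives, alignment length, first query coordinate).
HSPS = {
    "U73819": (328, 405, 570, 128),
    "Y08564": (322, 399, 570, 128),
    "AK024865": (282, 282, 284, 953),
    "Y08565": (246, 338, 537, 233),
    "AF031835": (252, 326, 530, 251),
}

if __name__ == "__main__":
    for accession, (ident, pos, length, start) in HSPS.items():
        print(
            f"{accession}: identities {percent(ident, length)}%, "
            f"positives {percent(pos, length)}%, frame +{plus_frame(start)}"
        )
```

Under these assumptions the computed values reproduce the printed ones, for example 57% and 71% in frame +2 for U73819, and 45% and 62% for the longer Y08565 segment, where rounding to the nearest whole number would instead give 46% and 63%.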



